medical test
Enhancing LLMs for Identifying and Prioritizing Important Medical Jargons from Electronic Health Record Notes Utilizing Data Augmentation
Jang, Won Seok, Sultana, Sharmin, Yao, Zonghai, Tran, Hieu, Yang, Zhichao, Kwon, Sunjae, Yu, Hong
OpenNotes enables patients to access EHR notes, but medical jargon can hinder comprehension. To improve understanding, we evaluated closed- and open-source LLMs for extracting and prioritizing key medical terms using prompting, fine-tuning, and data augmentation. We assessed LLMs on 106 expert-annotated EHR notes, experimenting with (i) general vs. structured prompts, (ii) zero-shot vs. few-shot prompting, (iii) fine-tuning, and (iv) data augmentation. To enhance open-source models in low-resource settings, we used ChatGPT for data augmentation and applied ranking techniques. We incrementally increased the augmented dataset size (10 to 10,000) and conducted 5-fold cross-validation, reporting F1 score and Mean Reciprocal Rank (MRR). Our results show that fine-tuning and data augmentation improved performance over other strategies. GPT-4 Turbo achieved the highest F1 (0.433), while Mistral7B with data augmentation had the highest MRR (0.746). Open-source models, when fine-tuned or augmented, outperformed closed-source models. Notably, the best F1 and MRR scores did not always align. Few-shot prompting outperformed zero-shot in vanilla models, and structured prompts yielded different preferences across models. Fine-tuning improved zero-shot performance but sometimes degraded few-shot performance. Data augmentation performed comparably to or better than other methods. Our evaluation highlights the effectiveness of prompting, fine-tuning, and data augmentation in improving model performance for medical jargon extraction in low-resource scenarios.
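The abstract above reports F1 and Mean Reciprocal Rank (MRR) but does not spell out how they are computed. The following is a minimal sketch of the standard definitions, assuming set-based matching of extracted terms for F1 and, for MRR, the rank of the first relevant term in each ranked list. The function names and the example terms are hypothetical and are not taken from the paper.

```python
def f1_score(predicted, gold):
    """Set-based F1 between predicted terms and gold (annotated) terms."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def mean_reciprocal_rank(ranked_lists, gold_sets):
    """Mean of 1/rank of the first relevant term per list (0 if none is relevant)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_sets):
        for rank, term in enumerate(ranked, start=1):
            if term in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Hypothetical example: one note for F1, two notes for MRR.
f1 = f1_score(["hypertension", "statin", "edema"],
              ["hypertension", "edema", "dyspnea"])
mrr = mean_reciprocal_rank([["statin", "edema"], ["hypertension"]],
                           [{"edema"}, {"hypertension"}])
print(round(f1, 3), mrr)
```

Because F1 ignores order while MRR depends only on where the first relevant term appears, a model can lead on one metric and not the other, which is consistent with the abstract's remark that the best F1 and MRR scores did not always align.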
Exploring the Effectiveness of Instruction Tuning in Biomedical Language Processing
Rohanian, Omid, Nouriborji, Mohammadmahdi, Clifton, David A.
Large Language Models (LLMs), particularly those similar to ChatGPT, have significantly influenced the field of Natural Language Processing (NLP). While these models excel in general language tasks, their performance in domain-specific downstream tasks such as biomedical and clinical Named Entity Recognition (NER), Relation Extraction (RE), and Medical Natural Language Inference (NLI) is still evolving. In this context, our study investigates the potential of instruction tuning for biomedical language processing, applying this technique to two general LLMs of substantial scale. We present a comprehensive, instruction-based model trained on a dataset that consists of approximately $200,000$ instruction-focused samples. This dataset represents a carefully curated compilation of existing data, meticulously adapted and reformatted to align with the specific requirements of our instruction-based tasks. This initiative represents an important step in utilising such models to achieve results on par with specialised encoder-only models like BioBERT and BioClinicalBERT for various classical biomedical NLP tasks. Our work includes an analysis of the dataset's composition and its impact on model performance, providing insights into the intricacies of instruction tuning. By sharing our codes, models, and the distinctively assembled instruction-based dataset, we seek to encourage ongoing research and development in this area.
How to Stop the Elizabeth Holmes of A.I.
Elizabeth Holmes convinced investors and patients that she had a prototype of a microsampling machine that could run a wide range of relatively accurate tests using a fraction of the volume of blood usually required. She lied; the Edison and miniLab devices didn't work. Worse still, the company was aware they didn't work, but continued to give patients inaccurate information about their health, including telling healthy pregnant women that they were having miscarriages and producing false positives on cancer and HIV screenings. But Holmes, who has to report to prison by May 30, was convicted of defrauding investors; she wasn't convicted of defrauding patients. This is because the principles of ethics for disclosure to investors, and the legal mechanisms used to take action against fraudsters like Holmes, are well developed.
Why Most Introductory Examples of Bayesian Statistics Misrepresent It - Towards AI
If you've ever come across material that introduces Bayesian Inference, you'll find that it usually involves an example of how misleading some medical testing devices can be in detecting diseases.
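The kind of introductory example the snippet refers to is usually a direct application of Bayes' theorem to a diagnostic test. A minimal sketch follows. The prevalence, sensitivity, and specificity values are hypothetical numbers chosen for illustration and do not come from the article.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical test: 1% prevalence, 99% sensitivity, 95% specificity.
p = posterior_given_positive(prevalence=0.01, sensitivity=0.99, specificity=0.95)
print(f"P(disease | positive) = {p:.3f}")
```

With these assumed numbers, a positive result implies only about a 1-in-6 chance of disease, because false positives from the large healthy population outnumber true positives.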
Epidemic mitigation by statistical inference from contact tracing data
Baker, Antoine, Biazzo, Indaco, Braunstein, Alfredo, Catania, Giovanni, Dall'Asta, Luca, Ingrosso, Alessandro, Krzakala, Florent, Mazza, Fabio, Mézard, Marc, Muntoni, Anna Paola, Refinetti, Maria, Mannelli, Stefano Sarao, Zdeborová, Lenka
Contact tracing is an essential tool for mitigating the impact of pandemics such as COVID-19. To achieve efficient and scalable contact tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing their performance and assessing their impact on the mitigation of the epidemic. We develop Bayesian inference methods to estimate the risk that an individual is infected. This inference is based on the list of their recent contacts and those contacts' own risk levels, as well as personal information such as results of tests or presence of syndromes. We propose to use probabilistic risk estimation in order to optimize testing and quarantining strategies for the control of an epidemic. Our results show that in some range of epidemic spreading (typically when the manual tracing of all contacts of infected people becomes practically impossible, but before the fraction of infected people reaches the scale where a lockdown becomes unavoidable), this inference of individuals at risk could be an efficient way to mitigate the epidemic. Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact. Such communication may be encrypted and anonymized and is thus compatible with privacy-preserving standards. We conclude that probabilistic risk estimation is capable of enhancing the performance of digital contact tracing and should be considered in the mobile applications currently being developed. Identifying, calling, testing, and if needed quarantining the recent contacts of an individual who has just tested positive is the standard route for limiting the transmission of a highly contagious virus.
South Korea winning the fight against coronavirus using big-data and AI
South Korea is fighting the novel coronavirus (COVID-19) by relying on its technological forte. The country has an advanced digital platform for big-data mining, along with artificial intelligence (AI), and Koreans are leading the technological front, with Samsung competing closely with Apple Inc. of the USA. Utilising big-data analysis, AI-powered advance warning systems, and intensive observation methodology, South Korea has already managed to bring the coronavirus situation in the country under control in a short time. South Korea is using the analysis, information and references provided by this integrated data -- all the different real-time responses and information produced by the platform are promptly conveyed to people through different AI-based applications. Whenever someone tests positive for COVID-19, all the people in the vicinity are provided with the infected person's travel details, activities, and commute maps for the previous two weeks through mobile push notifications.
Comparative Visual Analytics for Assessing Medical Records with Sequence Embedding
Guo, Rongchen, Fujiwara, Takanori, Li, Yiran, Lima, Kelly M., Sen, Soman, Tran, Nam K., Ma, Kwan-Liu
Machine learning for data-driven diagnosis has been actively studied in medicine to provide better healthcare. Supporting analysis of a patient cohort similar to a patient under treatment is a key task for clinicians to make decisions with high confidence. However, such analysis is not straightforward due to the characteristics of medical records: high dimensionality, irregularity in time, and sparsity. To address this challenge, we introduce a method for similarity calculation of medical records. Our method employs event and sequence embeddings. While we use an autoencoder for the event embedding, we apply its variant with the self-attention mechanism for the sequence embedding. Moreover, in order to better handle the irregularity of data, we enhance the self-attention mechanism with consideration of different time intervals. We have developed a visual analytics system to support comparative studies of patient records. To make a comparison of sequences with different lengths easier, our system incorporates a sequence alignment method. Through its interactive interface, the user can quickly identify patients of interest and conveniently review both the temporal and multivariate aspects of the patient records. We demonstrate the effectiveness of our design and system with case studies using a real-world dataset from the neonatal intensive care unit of UC Davis.
Channels' Confirmation and Predictions' Confirmation: from the Medical Test to the Raven Paradox
After long arguments between positivism and falsificationism, the verification of universal hypotheses was replaced with the confirmation of uncertain major premises. Unfortunately, Hempel discovered the Raven Paradox (RP). Then, Carnap used the logical probability increment as the confirmation measure. So far, many confirmation measures have been proposed. Among them, measure F, proposed by Kemeny and Oppenheim, possesses the symmetries and asymmetries proposed by Eells and Fitelson, the monotonicity proposed by Greco et al., and the normalizing property suggested by many researchers. Based on the semantic information theory, a measure b* similar to F is derived from the medical test. Like the likelihood ratio, b* and F can only indicate the quality of channels or the testing means instead of the quality of probability predictions. Moreover, it is still not easy to use b*, F, or another measure to clarify the RP. For this reason, a measure c* similar to the correct rate is derived. The measure c* has the simple form (a-c)/max(a, c); it supports the Nicod Criterion and undermines the Equivalence Condition, and hence can be used to eliminate the RP. Some examples are provided to show why it is difficult to use one of the popular confirmation measures to eliminate the RP. Measures F, b*, and c* indicate that the existence of fewer counterexamples is more essential than the existence of more positive examples, and hence are compatible with Popper's falsification thought.
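The formula for c* quoted above can be evaluated directly. The abstract does not define a and c, so this sketch assumes a is the number of positive examples and c is the number of counterexamples. That reading, the function name, and the sample counts are assumptions for illustration only.

```python
def c_star(a, c):
    """Confirmation measure c* = (a - c) / max(a, c).

    Assumption: a = count of positive examples, c = count of counterexamples.
    """
    if a < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    if a == 0 and c == 0:
        raise ValueError("c* is undefined when both counts are zero")
    return (a - c) / max(a, c)


# Hypothetical counts: one positive example and no counterexample scores
# higher than 1000 positive examples with 10 counterexamples.
print(c_star(1, 0))
print(c_star(1000, 10))
print(c_star(10, 90))
```

Under this reading, removing the last counterexamples raises c* more than adding further positive examples does, which matches the abstract's closing remark about counterexamples.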
Munich Re offers auto-underwriting expertise to the insurance market
Munich Re is offering auto-underwriting expertise and software that it claims can save mid-tier insurers time and money. Munich Re Automation Solutions, an insurtech specialist and part of Munich Re, says that ALLFINANZ SPARK, a software as a service (SaaS) solution, can help improve customer experience by automating and accelerating customer onboarding, and improve the rate of straight-through-processing. The company says the solution can also eliminate traditional methods including paperwork and repetitive health questions and intrusive medical tests, using predictive modelling and machine-learning algorithms to produce valuable datasets. "By offering auto-underwriting and analytics on a SaaS model we are removing many of the upfront costs and implementation barriers to entry that mid-tier firms face, while providing the full power of our enterprise application," says Declan O'Neill, EVP Product & Data at Munich Re Automation Solutions. "As a result, they can automate and accelerate customer onboarding, eliminate error-prone paperwork, repetitive questions and medical tests, and bring the power of analytical insights to their underwriting business. This means a more responsive, more agile business. Our customers who have already deployed SPARK on a SaaS basis have reported substantial improvements in client satisfaction as a result."
Effective Medical Test Suggestions Using Deep Reinforcement Learning
Chen, Yang-En, Tang, Kai-Fu, Peng, Yu-Shao, Chang, Edward Y.
Effective medical test suggestions benefit both patients and physicians to conserve time and improve diagnosis accuracy. In this work, we show that an agent can learn to suggest effective medical tests. We formulate the problem as a stage-wise Markov decision process and propose a reinforcement learning method to train the agent. We introduce a new representation of multiple action policy along with the training method of the proposed representation. Furthermore, a new exploration scheme is proposed to accelerate the learning of disease distributions. Our experimental results demonstrate that the accuracy of disease diagnosis can be significantly improved with good medical test suggestions.